README file for "Bias in self-reported voting and how it distorts turnout models: disentangling nonresponse bias and overreporting among Danish voters"

#Datasharing: 

The raw data for this paper is stored on servers at Statistics Denmark. For security and privacy reasons, the data cannot be made publicly available, and we as researchers do not have permission to extract or share micro data. 

Access to the micro data can only be granted by Statistics Denmark. Statistics Denmark has created a detailed step-by-step description of how such access is granted. The description, as of December 2018, is also uploaded to the Dataverse and is available from Statistics Denmark: 

https://www.dst.dk/-/media/Kontorer/13-Forskning-og-Metode/Step-by-step-procedures-for-researchers-access-to-Microdata_082018.pdf?la=en


Because we cannot provide the data, access for replication purposes has to be applied for through the process laid out by Statistics Denmark. Our script files are stored on Political Analysis's Dataverse: 
https://dataverse.harvard.edu/dataverse/pan.
Due to time and budget constraints connected with the acquisition of
data material from Statistics Denmark, this material could not be
replicated by PA staff and instead was inspected as part of a
high-level code review.

We welcome any inquiries with respect to data access. 

#Overview of replication files

datasharing.pdf contains the same statement on datasharing as written above. 

overreporting_weights.do merges data from administrative registers with data from the Danish National Election Studies. It creates the results for all tables and figures in both the paper and the supporting information. Results are saved in table format. 
Runtime in Stata 15 for Windows 64-bit on Statistics Denmark's server with two 2.2 GHz processors with 44 cores and 768 GB RAM: 1938 sec

overreporting_log.smcl is a log file documenting the execution of overreporting_weights.do. 

figure1_data.txt is an output from overreporting_weights.do. It includes results that are used as plotting data for Figure 1.

figures1_data.txt is an output from overreporting_weights.do. It includes results that are used as plotting data for Figure S1.

figures2_data.txt is an output from overreporting_weights.do. It includes results that are used as plotting data for Figure S2.


table1.txt is an output from overreporting_weights.do. It includes raw table data for Table 1.

table3.txt is an output from overreporting_weights.do. It includes raw table data for Table 3.

tableS1.txt is an output from overreporting_weights.do. It includes raw table data for Table S1.

marginal_nonrespondents_weight.txt is an output from overreporting_weights.do. It includes raw table data for Table S2. The results shown in Table S2 are rescaled to %.

marginal_overreport_weight.txt is an output from overreporting_weights.do. It includes raw table data for Table S3. The results shown in Table S3 are rescaled to %.

marginal_responded_weight.txt is an output from overreporting_weights.do. It includes raw table data for Table 2. The results shown in Table 2 are rescaled to %.

figureS3_data.csv includes the names of all Danish municipalities and indicates whether turnout data is available for voters in each municipality. 


figure1.R takes figure1_data.txt as data and creates Figure 1 and Table S4. 
Required R packages: ggplot2, dplyr, xtable, readr.
Runtime in RStudio R-version 3.3.2 for Windows 64-bit with Intel i7 2.70 GHz processor and 8 GB RAM: 0.11 sec
RAM used: 72.75

figures1.R takes figures1_data.txt as data and creates Figure S1. 
Required R packages: ggplot2, dplyr, readr.
Runtime in RStudio R-version 3.3.2 for Windows 64-bit with Intel i7 2.70 GHz processor and 8 GB RAM: 0.50 sec
RAM used: 72.06

figures2.R takes figures2_data.txt as data and creates Figure S2. 
Required R packages: ggplot2, dplyr, readr.
Runtime in RStudio R-version 3.3.2 for Windows 64-bit with Intel i7 2.70 GHz processor and 8 GB RAM: 0.56 sec
RAM used: 67.3

figureS3.R takes figureS3_data.csv as data and creates Figure S3. 
Required R packages: ggplot2, dplyr, devtools, mapDK (mapDK is installed by the script itself).
Runtime in RStudio R-version 3.3.2 for Windows 64-bit with Intel i7 2.70 GHz processor and 8 GB RAM: 10.35 sec
RAM used: 142.4
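
Before running the scripts, it may be useful to confirm that a downloaded copy of the replication files is complete. The following is a minimal sketch and not part of the replication package: it is written in Python, the helper name missing_files is hypothetical, and it assumes that all files listed in this README sit in one folder under exactly the names given above.

```python
from pathlib import Path

# File names copied from the overview in this README.
EXPECTED_FILES = [
    "datasharing.pdf",
    "overreporting_weights.do",
    "overreporting_log.smcl",
    "figure1_data.txt",
    "figures1_data.txt",
    "figures2_data.txt",
    "table1.txt",
    "table3.txt",
    "tableS1.txt",
    "marginal_nonrespondents_weight.txt",
    "marginal_overreport_weight.txt",
    "marginal_responded_weight.txt",
    "figureS3_data.csv",
    "figure1.R",
    "figures1.R",
    "figures2.R",
    "figureS3.R",
]


def missing_files(directory, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in `directory`."""
    folder = Path(directory)
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    # Assumption: the script is run from the folder holding the files.
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for name in missing:
            print("  " + name)
    else:
        print("All listed files are present.")
```

Running the script from the folder that holds the downloaded files prints any file names that could not be found. The raw micro data is not in the list, since it is only available on Statistics Denmark's servers.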




